



## COMPUTER ORGANIZATION AND SOFTWARE SYSTEMS

BITS Pilani
Pilani Campus

Lakshmikantha G C WILP & Department of CS & IS



# Webinar-2 MIPS-Multicycle Implementation

#### Last Class...

- MIPS Instruction set architecture
- Datapath for
  - fetching instructions
  - updating program counter
  - implementing R-type ALU operations
  - implementing data memory unit and the sign extension unit
  - datapath for implementing branch instructions
- Effect of control signals




BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956



## Effects of 7 control signals

| Signal name | Effect when deasserted | Effect when asserted |
|-------------|------------------------|----------------------|
| RegDst      | The register destination number for the Write register comes from the rt field (bits 20:16). | The register destination number for the Write register comes from the rd field (bits 15:11). |
| RegWrite    | None. | The register on the Write register input is written with the value on the Write data input. |
| ALUSrc      | The second ALU operand comes from the second register file output (Read data 2). | The second ALU operand is the sign-extended, lower 16 bits of the instruction. |
| PCSrc       | The PC is replaced by the output of the adder that computes the value of PC + 4. | The PC is replaced by the output of the adder that computes the branch target. |
| MemRead     | None. | Data memory contents designated by the address input are put on the Read data output. |
| MemWrite    | None. | Data memory contents designated by the address input are replaced by the value on the Write data input. |
| MemtoReg    | The value fed to the register Write data input comes from the ALU. | The value fed to the register Write data input comes from the data memory. |



## Effect of ALUop

| ALUop | Instructions                       |
|-------|------------------------------------|
| 00    | lw, sw (I-type)                    |
| 01    | beq                                |
| 10    | add, sub, and, or, slt (R-type)    |
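The ALUop encoding can be sketched as a small decode function. This is only an illustration, not the actual control logic from the slides: the function name `alu_control` and the table `FUNCT_TO_OP` are hypothetical, and the funct values are the standard MIPS encodings for the five R-type instructions listed above.

```python
# Standard MIPS funct codes for the five R-type instructions in the table.
FUNCT_TO_OP = {
    0x20: "add",
    0x22: "sub",
    0x24: "and",
    0x25: "or",
    0x2A: "slt",
}


def alu_control(alu_op: int, funct: int = 0) -> str:
    """Return the ALU operation selected by the 2-bit ALUop and the funct field."""
    if alu_op == 0b00:   # lw / sw: add the base register and the offset
        return "add"
    if alu_op == 0b01:   # beq: subtract, then test the Zero output
        return "sub"
    if alu_op == 0b10:   # R-type: the funct field decides
        return FUNCT_TO_OP[funct & 0x3F]
    raise ValueError(f"unsupported ALUop: {alu_op:02b}")


print(alu_control(0b00))        # add
print(alu_control(0b01))        # sub
print(alu_control(0b10, 0x2A))  # slt
```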

## Execution steps: sw \$s1, 100(\$s2)

1. The instruction is fetched, and the PC is incremented
2. Two registers, \$s2 and \$s1, are read from the register file
3. The ALU computes the sum of the value read from register \$s2 and the sign-extended lower 16 bits of the instruction
4. The sum from the ALU is used as the address for the data memory, and the contents of \$s1 are written to memory
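As a rough sketch of steps 3 and 4, the address calculation and the memory write for `sw` can be modelled in a few lines. The helper names (`sign_extend16`, `sw_step`), the dictionary-based register file and memory, and the register values are all assumptions made for illustration.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sw_step(regs: dict, memory: dict, rs: str, rt: str, imm16: int) -> int:
    """Model sw rt, imm16(rs): compute the address, then store rt there."""
    address = (regs[rs] + sign_extend16(imm16)) & 0xFFFFFFFF  # ALU add
    memory[address] = regs[rt]                                # data memory write
    return address


# Hypothetical register contents for sw $s1, 100($s2).
regs = {"$s1": 0xDEADBEEF, "$s2": 0x1000}
memory = {}
addr = sw_step(regs, memory, rs="$s2", rt="$s1", imm16=100)
print(hex(addr), hex(memory[addr]))  # 0x1064 0xdeadbeef
```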





# Homework: Execution steps: beq \$s1, \$s2, offset

1. The instruction is fetched, and the PC is incremented
2. Two registers, \$s1 and \$s2, are read from the register file
3. The ALU performs a subtract on the data values read from the register file. The value of PC + 4 is added to the sign-extended lower 16 bits of the instruction (offset) shifted left by two; the result is the branch target address
4. The Zero result from the ALU is used to decide which adder result to store into the PC
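The branch decision described in these steps can be sketched as follows. The function name `beq_next_pc` and the example addresses are hypothetical; the arithmetic follows the steps above (subtract to produce Zero, add PC + 4 to the shifted, sign-extended offset).

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def beq_next_pc(pc: int, a: int, b: int, offset16: int) -> int:
    """Return the next PC for beq, given the two register values and the offset."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    target = (pc_plus_4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF
    zero = ((a - b) & 0xFFFFFFFF) == 0   # ALU subtract, Zero output
    return target if zero else pc_plus_4


# Hypothetical values: the instruction sits at address 0x00400000.
print(hex(beq_next_pc(0x00400000, 5, 5, 3)))       # 0x400010 (taken)
print(hex(beq_next_pc(0x00400000, 5, 6, 3)))       # 0x400004 (not taken)
print(hex(beq_next_pc(0x00400000, 7, 7, 0xFFFF)))  # 0x400000 (offset -1)
```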



## Single Cycle Advantages & Disadvantages



- Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate the slowest instruction
- May be wasteful of area, since some functional units (e.g., adders) must be duplicated because they cannot be shared during a clock cycle
- But it is simple and easy to understand



Lucy J. Gudino



## Multicycle datapath approach

- also called multiple clock cycle implementation
- an instruction is executed in multiple clock cycles
- Advantage: hardware sharing and ability to allow instructions to take different numbers of clock cycles
- Break up instructions into steps where each step takes a cycle while trying to
  - balance the amount of work to be done in each step
  - restrict each cycle to use only one major functional unit
- Not every instruction takes the same number of clock cycles

## Single cycle vs multicycle



- Single cycle:
  - Two memory units one for instruction and one for data
  - one ALU and two adders
- Multicycle:
  - A single memory unit for both instructions and data
    - only one memory access per cycle
  - only a single ALU
    - only one ALU operation per cycle
  - one or more registers are added after every major functional unit to hold the output of that unit until the value is used in a subsequent clock cycle


## Temporary registers

- IR → Instruction Register
- MDR → Memory Data Register
- A, B → used to hold the register operand values read from the register file
- ALUout → holds the output of the ALU



#### Contd...

#### Important note:

1. All data that is used in subsequent clock cycles must be stored in inter-stage registers
2. Data used by subsequent instructions is stored in programmer-visible registers (i.e., register file, PC, or memory)
3. None of the temporary registers except the IR needs a write control signal
4. Expand the multiplexer units

## The Five Steps

- Step 1: IF (Instruction Fetch)
- Step 2: ID (Instruction Decode)
- Step 3: EX (Execute)
- Step 4: MEM (Memory)
- Step 5: WB (Write Back)



## Step 1: IF (Instruction Fetch)

Instruction Fetch and Update PC

## Step 2: ID (Instruction Decode)

- Following operations are performed
  - read two registers corresponding to rs and rt fields

```
A ← Reg [ IR [25:21] ]    (rs)
B ← Reg [ IR [20:16] ]    (rt)
```

  - compute branch target address with ALU: ALUout ← PC + (sign-extend (IR[15:0]) << 2)
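To make the bit ranges concrete, here is a minimal sketch of the field extraction done in the decode step. The function name `decode_fields` and the register contents are assumptions; the instruction word `0x02538820` is the standard encoding of `add $s1, $s2, $s3` (registers 17, 18, 19).

```python
def decode_fields(ir: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "opcode": (ir >> 26) & 0x3F,  # IR[31:26]
        "rs": (ir >> 21) & 0x1F,      # IR[25:21]
        "rt": (ir >> 16) & 0x1F,      # IR[20:16]
        "rd": (ir >> 11) & 0x1F,      # IR[15:11]
        "imm16": ir & 0xFFFF,         # IR[15:0]
    }


fields = decode_fields(0x02538820)    # add $s1, $s2, $s3
print(fields["rs"], fields["rt"], fields["rd"])  # 18 19 17

# Hypothetical register file contents; A and B are latched from rs and rt.
reg = [0] * 32
reg[18], reg[19] = 7, 35
A, B = reg[fields["rs"]], reg[fields["rt"]]
print(A, B)  # 7 35
```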

## Step 3: EX (Execute)

- Four operations are possible
  1. Memory reference
     - ALUout ← A + sign-extend (IR[15:0])
  2. Arithmetic and logical instruction
     - ALUout ← A op B
  3. Branch (beq)
     - if ( A == B ) PC ← ALUout
  4. Jump
     - PC ← { PC[31:28], (IR [25:0] << 2) }
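The jump case concatenates the upper four bits of the PC with the 26-bit target field shifted left by two. A small sketch, with a hypothetical function name and hypothetical addresses:

```python
def jump_target(pc: int, ir: int) -> int:
    """Build { PC[31:28], IR[25:0] << 2 } as a 32-bit address."""
    upper = pc & 0xF0000000             # PC[31:28]
    lower = (ir & 0x03FFFFFF) << 2      # IR[25:0] shifted left by 2
    return upper | lower


# Hypothetical example: opcode 2 (j) with target field 0x400,
# executed while the already incremented PC is 0x40000004.
ir = (0x02 << 26) | 0x0000400
print(hex(jump_target(0x40000004, ir)))  # 0x40001000
```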





## Step 4: MEM (Memory)

- Memory access or R-type instruction completion step
  1. Memory Access
     - MDR ← Memory [ALUout], or
     - Memory [ALUout] ← B
  2. R-Type Instruction Completion
     - Reg [ IR [15:11] ] ← ALUout

## Step 5: WB (Write Back)

- Memory read completion step
  - Reg [ IR [20:16 ] ] ← MDR





## R-Type Instruction: add \$s1, \$s2, \$s3

#### Step 1: IFetch

- IR ← Memory [PC]
- PC ← PC + 4

#### Step 2: Dec

- read two registers corresponding to rs and rt fields
  - A ← Reg [ IR [25:21] ]
  - B ← Reg [ IR [20:16] ]
- compute branch target address with ALU
  - ALUout ← PC + (sign-extend (IR[15:0]) << 2)

#### Step 3: Exec

1. Memory reference
   - ALUout ← A + sign-extend (IR[15:0])
2. Arithmetic and logical instruction
   - ALUout ← A op B
3. Branch
   - if ( A == B ) PC ← ALUout
4. Jump
   - PC ← { PC[31:28], (IR [25:0] << 2) }

#### Step 4: Mem

1. Memory Access
   - MDR ← Memory [ALUout], or
   - Memory [ALUout] ← B
2. R-Type Instruction Completion
   - Reg [ IR [15:11] ] ← ALUout

#### Step 5: WB

- Memory read completion step
  - Reg [ IR [20:16] ] ← MDR



## lw Instruction: lw \$s1, 100(\$s2)

#### Step 1: IFetch

- IR ← Memory [PC]
- PC ← PC + 4

#### Step 2: Dec

- read two registers corresponding to rs and rt fields
  - A ← Reg [ IR [25:21] ]
  - B ← Reg [ IR [20:16] ]
- compute branch target address with ALU
  - ALUout ← PC + (sign-extend (IR[15:0]) << 2)

#### Step 3: Exec

1. Memory reference
   - ALUout ← A + sign-extend (IR[15:0])
2. Arithmetic and logical instruction
   - ALUout ← A op B
3. Branch
   - if ( A == B ) PC ← ALUout
4. Jump
   - PC ← { PC[31:28], (IR [25:0] << 2) }

#### Step 4: Mem

1. Memory Access
   - MDR ← Memory [ALUout], or
   - Memory [ALUout] ← B
2. R-Type Instruction Completion
   - Reg [ IR [15:11] ] ← ALUout

#### Step 5: WB

- Memory read completion step
  - Reg [ IR [20:16] ] ← MDR

## sw Instruction: sw \$s1, 100(\$s2)

#### Step 1: IFetch

- IR ← Memory [PC]
- PC ← PC + 4

#### Step 2: Dec

- read two registers corresponding to rs and rt fields
  - A ← Reg [ IR [25:21] ]
  - B ← Reg [ IR [20:16] ]
- compute branch target address with ALU
  - ALUout ← PC + (sign-extend (IR[15:0]) << 2)

#### Step 3: Exec

1. Memory reference
   - ALUout ← A + sign-extend (IR[15:0])
2. Arithmetic and logical instruction
   - ALUout ← A op B
3. Branch
   - if ( A == B ) PC ← ALUout
4. Jump
   - PC ← { PC[31:28], (IR [25:0] << 2) }

#### Step 4: Mem

1. Memory Access
   - MDR ← Memory [ALUout], or
   - Memory [ALUout] ← B
2. R-Type Instruction Completion
   - Reg [ IR [15:11] ] ← ALUout

#### Step 5: WB

- Memory read completion step
  - Reg [ IR [20:16] ] ← MDR



## beq Instruction: beq \$s1, \$s2, 100

#### Step 1: IFetch

- IR ← Memory [PC]
- PC ← PC + 4

#### Step 2: Dec

- read two registers corresponding to rs and rt fields
  - A ← Reg [ IR [25:21] ]
  - B ← Reg [ IR [20:16] ]
- compute branch target address with ALU
  - ALUout ← PC + (sign-extend (IR[15:0]) << 2)

#### Step 3: Exec

1. Memory reference
   - ALUout ← A + sign-extend (IR[15:0])
2. Arithmetic and logical instruction
   - ALUout ← A op B
3. Branch
   - if ( A == B ) PC ← ALUout
4. Jump
   - PC ← { PC[31:28], (IR [25:0] << 2) }

#### Step 4: Mem

1. Memory Access
   - MDR ← Memory [ALUout], or
   - Memory [ALUout] ← B
2. R-Type Instruction Completion
   - Reg [ IR [15:11] ] ← ALUout

#### Step 5: WB

- Memory read completion step
  - Reg [ IR [20:16] ] ← MDR

CPI = 3

## Summary

- R-Type: requires four cycles (IF, ID, EX, WB), CPI = 4
- Load: requires five cycles (IF, ID, EX, MEM, WB), CPI = 5
- Store: requires four cycles (IF, ID, EX, MEM), CPI = 4
- Branch: requires three cycles (IF, ID, EX), CPI = 3
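Given these per-class cycle counts, the average CPI of a program is the weighted average over its instruction mix. The sketch below uses a hypothetical mix (50% R-type, 25% loads, 10% stores, 15% branches) purely for illustration; it is not measured data, and the names `CYCLES`, `MIX_PERCENT`, and `average_cpi` are made up for this example.

```python
# Cycles per instruction class, taken from the summary above.
CYCLES = {"r_type": 4, "load": 5, "store": 4, "branch": 3}

# Hypothetical instruction mix in percent (an assumption, not measured data).
MIX_PERCENT = {"r_type": 50, "load": 25, "store": 10, "branch": 15}


def average_cpi(cycles: dict, mix_percent: dict) -> float:
    """Weighted average of the cycle counts; the mix must add up to 100 percent."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("instruction mix must sum to 100")
    return sum(cycles[k] * mix_percent[k] for k in mix_percent) / 100


print(average_cpi(CYCLES, MIX_PERCENT))  # 4.1
```

With this assumed mix the multicycle design averages 4.1 cycles per instruction, but each cycle is much shorter than the single long cycle of the single-cycle design.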

## Complete Data Path (1/3)







## Complete Data Path (3/3)





## Actions of the 1 bit control signals

| Signal name | Effect when deasserted | Effect when asserted |
|-------------|------------------------|----------------------|
| RegDst      | The register file destination number for the Write register comes from the rt field. | The register file destination number for the Write register comes from the rd field. |
| RegWrite    | None. | The general-purpose register selected by the Write register number is written with the value of the Write data input. |
| ALUSrcA     | The first ALU operand is the PC. | The first ALU operand comes from the A register. |
| MemRead     | None. | Content of memory at the location specified by the Address input is put on the Memory data output. |
| MemWrite    | None. | Memory contents at the location specified by the Address input are replaced by the value on the Write data input. |
| MemtoReg    | The value fed to the register file Write data input comes from ALUOut. | The value fed to the register file Write data input comes from the MDR. |
| IorD        | The PC is used to supply the address to the memory unit. | ALUOut is used to supply the address to the memory unit. |
| IRWrite     | None. | The output of the memory is written into the IR. |
| PCWrite     | None. | The PC is written; the source is controlled by PCSource. |
| PCWriteCond | None. | The PC is written if the Zero output from the ALU is also active. |



## Actions of the 2 bit control signals

| Signal name | Value (binary) | Effect |
|-------------|----------------|--------|
| ALUOp       | 00             | The ALU performs an add operation. |
|             | 01             | The ALU performs a subtract operation. |
|             | 10             | The funct field of the instruction determines the ALU operation. |
| ALUSrcB     | 00             | The second input to the ALU comes from the B register. |
|             | 01             | The second input to the ALU is the constant 4. |
|             | 10             | The second input to the ALU is the sign-extended, lower 16 bits of the IR. |
|             | 11             | The second input to the ALU is the sign-extended, lower 16 bits of the IR shifted left 2 bits. |
| PCSource    | 00             | Output of the ALU (PC + 4) is sent to the PC for writing. |
|             | 01             | The contents of ALUOut (the branch target address) are sent to the PC for writing. |
|             | 10             | The jump target address (IR[25:0] shifted left 2 bits and concatenated with PC + 4[31:28]) is sent to the PC for writing. |
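The ALUSrcB rows of the table describe a four-way multiplexer in front of the second ALU input. A minimal sketch of that selection, assuming hypothetical names (`alu_src_b_mux`, `sign_extend16`) and 32-bit wraparound:

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def alu_src_b_mux(select: int, b: int, ir: int) -> int:
    """Return the second ALU operand chosen by the 2-bit ALUSrcB signal."""
    imm = sign_extend16(ir)
    options = {
        0b00: b,         # B register
        0b01: 4,         # constant 4 (used for PC + 4)
        0b10: imm,       # sign-extended IR[15:0]
        0b11: imm << 2,  # sign-extended IR[15:0] shifted left 2 bits
    }
    return options[select] & 0xFFFFFFFF


# Hypothetical instruction word whose low 16 bits are 0xFFFC (i.e. -4).
ir = 0x1000FFFC
print(alu_src_b_mux(0b01, b=9, ir=ir))       # 4
print(hex(alu_src_b_mux(0b10, b=9, ir=ir)))  # 0xfffffffc
print(hex(alu_src_b_mux(0b11, b=9, ir=ir)))  # 0xfffffff0
```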

## First Step



- Instruction Fetch: IFetch
  - IR ← Memory [PC]:
    - MemRead = 1
    - IRWrite = 1
    - IorD = 0
  - PC ← PC + 4:
    - ALUSrcA = 0
    - ALUSrcB = 01
    - ALUop = 00
    - PCSource = 00
    - PCWrite = 1

IorD

- 0: PC
- 1: ALUout

ALUSrcA

- 0: First operand is in PC
- 1: First operand is in register A

ALUSrcB (second operand)

- 00: B register
- 01: 4
- 10: Sign Extd lower 16 bits of IR
- 11: Sign Extd lower 16 bits of IR << 2

#### Contd...





#### Instruction Fetch:

- IR ← Memory [PC]:
  - MemRead = 1
  - IRWrite = 1
  - IorD = 0
- PC ← PC + 4:
  - ALUSrcA = 0
  - ALUSrcB = 01
  - ALUop = 00
  - PCSource = 00
  - PCWrite = 1

ALUop

- 00: Add operation
- 01: Subtract operation
- 10: Function field of the instruction determines the ALU operation

PCSource

- 00: PC + 4 output of ALU
- 01: branch target address ALUout
- 10: jump target address

## Second Step

Instruction decode and register fetch

- read two registers corresponding to rs and rt fields
  - A ← Reg [ IR [25:21] ]
  - B ← Reg [ IR [20:16] ]
- compute branch target address with ALU
  - ALUout ← PC + (sign-extend (IR[15:0]) << 2)

Control signals:

- ALUSrcA = 0
  - 0: First operand is PC
  - 1: First operand is register A
- ALUSrcB = 11
  - 00: B register
  - 01: 4
  - 10: Sign Extd lower 16 bits of IR
  - 11: Sign Extd lower 16 bits of IR << 2
- ALUop = 00
  - 00: Add
  - 01: Subtract
  - 10: Function field determines the ALU operation



## Step 3: Execution, memory address computation, or branch completion

- Four operations are possible
  1. Memory reference
     - ALUout ← A + sign-extend (IR[15:0])
  2. Arithmetic and logical instruction
     - ALUout ← A op B
  3. Branch
     - if ( A == B ) PC ← ALUout
  4. Jump
     - PC ← { PC[31:28], (IR [25:0] << 2) }



#### 1. Memory reference

ALUout ← A + sign-extend (IR[15:0])

- ALUSrcA = 1
  - 0: First operand is PC
  - 1: First operand is register A
- ALUSrcB = 10
  - 00: B register
  - 01: 4
  - 10: Sign Extd lower 16 bits of IR
  - 11: Sign Extd lower 16 bits of IR << 2
- ALUop = 00
  - 00: Add operation
  - 01: Subtract operation
  - 10: Function field of the instruction determines the ALU operation





#### 2. Arithmetic and logical instruction

ALUout ← A op B

- ALUSrcA = 1
  - 0: First operand is PC
  - 1: First operand is register A
- ALUSrcB = 00
  - 00: B register
  - 01: 4
  - 10: Sign Extd lower 16 bits of IR
  - 11: Sign Extd lower 16 bits of IR << 2
- ALUop = 10
  - 00: Add operation
  - 01: Subtract operation
  - 10: Function field of the instruction determines the ALU operation

#### 3. Branch

if ( A == B ) PC ← ALUout

- ALUSrcA = 1
  - 0: First operand is PC
  - 1: First operand is register A
- ALUSrcB = 00
  - 00: B register
  - 01: 4
  - 10: Sign Extd lower 16 bits of IR
  - 11: Sign Extd lower 16 bits of IR << 2
- ALUop = 01
  - 00: Add operation
  - 01: Subtract operation (beq)
  - 10: Function field of the instruction determines the ALU operation
- PCWriteCond = 1 (the PC is written only if the Zero output of the ALU is active)
- PCSource = 01
  - 00: PC + 4 output of ALU
  - 01: branch target address ALUout
  - 10: jump target address





#### 4. Jump

PC ← { PC[31:28], (IR [25:0] << 2) }

- PCSource = 10
  - 00: PC + 4 output of ALU
  - 01: branch target address ALUout
  - 10: jump target address
- PCWrite = 1

## Step 4

- Memory access or R-type instruction completion step
  - Two operations are possible
    - Memory reference
      - MDR ← Memory [ALUout], or
      - Memory [ALUout] ← B
    - Arithmetic and logical instruction
      - Reg [ IR [15:11] ] ← ALUout





Memory reference

lw: MDR ← Memory [ALUout]

- MemRead = 1
- IorD = 1

sw: Memory [ALUout] ← B

- MemWrite = 1
- IorD = 1


## Step 4 contd...



Arithmetic and logical instruction

Reg [ IR [15:11] ] ← ALUout

- RegDst = 1
- RegWrite = 1
- MemtoReg = 0



## Step 5



Write back or Memory read completion step

- Reg [ IR [20:16] ] ← MDR
  - RegWrite = 1
  - RegDst = 0
  - MemtoReg = 1





## Summary

| Step name | Action for R-type instructions | Action for memory-reference instructions | Action for branches | Action for jumps |
|-----------|--------------------------------|------------------------------------------|---------------------|------------------|
| Instruction fetch | IR <= Memory[PC]; PC <= PC + 4 (all instruction types) | | | |
| Instruction decode/register fetch | A <= Reg[IR[25:21]]; B <= Reg[IR[20:16]]; ALUOut <= PC + (sign-extend (IR[15:0]) << 2) (all instruction types) | | | |
| Execution, address computation, branch/jump completion | ALUOut <= A op B | ALUOut <= A + sign-extend (IR[15:0]) | if (A == B) PC <= ALUOut | PC <= {PC[31:28], (IR[25:0], 2'b00)} |
| Memory access or R-type completion | Reg[IR[15:11]] <= ALUOut | Load: MDR <= Memory[ALUOut] or Store: Memory[ALUOut] <= B | | |
| Memory read completion | | Load: Reg[IR[20:16]] <= MDR | | |
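The register transfers in this table can be traced with a very small interpreter. The following is a sketch under simplifying assumptions, not the datapath from the slides: memory is a Python dictionary keyed by byte address, only `add` is modelled among the R-type instructions, and the names `run_instruction` and `state` are hypothetical. The opcodes (lw 0x23, sw 0x2B, beq 0x04, j 0x02) are the standard MIPS values.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_instruction(state: dict) -> int:
    """Execute one instruction step by step and return the cycles it used."""
    mem, reg = state["mem"], state["reg"]
    # Step 1: instruction fetch
    ir = mem[state["pc"]]
    state["pc"] += 4
    # Step 2: decode, register fetch, speculative branch target
    opcode = ir >> 26
    rs, rt, rd = (ir >> 21) & 31, (ir >> 16) & 31, (ir >> 11) & 31
    a, b = reg[rs], reg[rt]
    alu_out = (state["pc"] + (sign_extend16(ir) << 2)) & 0xFFFFFFFF
    # Step 3: execute (branches and jumps complete here)
    if opcode == 0x04:                      # beq
        if a == b:
            state["pc"] = alu_out
        return 3
    if opcode == 0x02:                      # j
        state["pc"] = (state["pc"] & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
        return 3
    if opcode == 0x00:                      # R-type (only add is modelled)
        alu_out = (a + b) & 0xFFFFFFFF
        reg[rd] = alu_out                   # Step 4: R-type completion
        return 4
    alu_out = (a + sign_extend16(ir)) & 0xFFFFFFFF
    if opcode == 0x2B:                      # sw
        mem[alu_out] = b                    # Step 4: memory write
        return 4
    if opcode == 0x23:                      # lw
        mdr = mem[alu_out]                  # Step 4: memory read
        reg[rt] = mdr                       # Step 5: memory read completion
        return 5
    raise ValueError(f"unsupported opcode {opcode:#x}")


# Hypothetical program: lw $t0, 0x100($zero); add $t1, $t0, $t0; sw $t1, 0x104($zero)
state = {
    "pc": 0,
    "reg": [0] * 32,
    "mem": {0: 0x8C080100, 4: 0x01084820, 8: 0xAC090104, 0x100: 21},
}
cycles = sum(run_instruction(state) for _ in range(3))
print(cycles, state["reg"][9], state["mem"][0x104])  # 13 42 42
```

The cycle counts returned (5 + 4 + 4) match the per-class counts in the table: loads finish in the fifth step, R-type instructions and stores in the fourth, branches and jumps in the third.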



